18 research outputs found

    Forecasting Carbon Dioxide Emission in Thailand Using Machine Learning Techniques

    Machine Learning (ML) models and the massive quantity of data now accessible provide useful tools for analyzing climate change trends and identifying major contributors. Random Forest (RF), Gradient Boosting Regression (GBR), XGBoost (XGB), Support Vector Machines (SVM), Decision Trees (DT), K-Nearest Neighbors (KNN), Principal Component Analysis (PCA), ensemble methods, and Genetic Algorithms (GA) are used in this study to predict CO2 emissions in Thailand. Several evaluation criteria are used to determine how well these models work, including R-squared (R2), mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and accuracy. The results show that the RF and XGB algorithms perform exceptionally well, with high R-squared values and low error rates. KNN, PCA, ensemble methods, and GA, on the other hand, fall short of the top-performing models: their lower R-squared values and higher error scores indicate that they cannot accurately anticipate CO2 emissions. This paper contributes to the field of environmental modeling by comparing the effectiveness of various machine learning approaches in forecasting CO2 emissions. The findings can assist Thailand in promoting sustainable development and in developing policies consistent with worldwide efforts to combat climate change.
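
    A minimal sketch of the kind of comparison described above, using scikit-learn's RandomForestRegressor and the listed error metrics; the file name, feature columns, and split are illustrative assumptions, not the paper's actual setup:

```python
# Sketch only: evaluating a Random Forest CO2-emission regressor with
# R2 / MAE / RMSE / MAPE, as in the abstract. Feature names and the
# train/test split are assumptions, not the paper's configuration.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error, r2_score)

df = pd.read_csv("thailand_emissions.csv")          # hypothetical file
X = df.drop(columns=["co2_emissions"])              # hypothetical predictors
y = df["co2_emissions"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(n_estimators=300, random_state=42)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

print("R2  :", r2_score(y_te, pred))
print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
print("MAPE:", mean_absolute_percentage_error(y_te, pred))
```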

    Rough Sets Clustering and Markov model for Web Access Prediction

    Discovering user access patterns from web access logs provides increasingly important information for building an adaptive web server that responds to individual user behavior. The variety of user behaviors when accessing information also grows, which has a great impact on network utilization. In this paper, we present a rough set clustering approach to cluster web transactions from web access logs and use a Markov model for next-access prediction. Using this approach, web log records can be mined effectively to discover and predict access patterns. We perform experiments using real web trace logs collected from the www.dusit.ac.th servers. To improve its prediction ratio, the model includes a rough sets scheme whose similarity measure computes the similarity between two sequences using the upper approximation.
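
    A minimal sketch of the next-access prediction step, assuming sessions have already been extracted from the log (the rough-set clustering stage described above is not shown); a first-order Markov model is built from page-to-page transition counts:

```python
# Sketch only: first-order Markov model for next-page prediction from
# clickstream sessions. Session extraction and the rough-set clustering
# described in the abstract are assumed to have happened upstream.
from collections import defaultdict, Counter

sessions = [                      # hypothetical sessions (lists of page IDs)
    ["index", "courses", "register"],
    ["index", "news", "courses"],
    ["index", "courses", "courses_detail"],
]

transitions = defaultdict(Counter)
for pages in sessions:
    for cur, nxt in zip(pages, pages[1:]):
        transitions[cur][nxt] += 1

def predict_next(page):
    """Return the most probable next page and its estimated probability."""
    counts = transitions.get(page)
    if not counts:
        return None, 0.0
    nxt, freq = counts.most_common(1)[0]
    return nxt, freq / sum(counts.values())

print(predict_next("index"))      # e.g. ('courses', 0.666...)
```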

    Using Markov Model and Association Rules for Web Access Prediction

    Mining user patterns from log files can provide significant and useful knowledge. A large amount of research has been done on trying to correctly predict the pages a user will request, a task that requires models able to predict a user's next request to a web server. In this paper, we propose a method for constructing first-order and second-order Markov models of Web site access prediction based on past visitor behavior and compare it with the association rules technique. The algorithm also clusters similar transition behaviors, which is used to further improve prediction efficiency. From this comparison we propose a best overall method and empirically test the proposed model on real web logs.
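
    Extending the first-order sketch above (session extraction is again assumed), a second-order model conditions the prediction on the last two pages visited instead of one:

```python
# Sketch only: second-order Markov model, keyed on the previous two pages.
# Sessions are assumed to be pre-extracted from the web log.
from collections import defaultdict, Counter

sessions = [
    ["index", "courses", "register", "confirm"],
    ["index", "courses", "register", "cancel"],
    ["index", "news", "courses", "register"],
]

second_order = defaultdict(Counter)
for pages in sessions:
    for a, b, c in zip(pages, pages[1:], pages[2:]):
        second_order[(a, b)][c] += 1

def predict_next(prev2, prev1):
    """Most likely next page given the last two pages visited."""
    counts = second_order.get((prev2, prev1))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(predict_next("index", "courses"))   # e.g. 'register'
```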

    Anomaly network intrusion detection method in network security based on principle component analysis

    Most current intrusion detection methods cannot process large amounts of audit data for real-time operation. In this paper, an anomaly network intrusion detection method based on Principal Component Analysis (PCA) for data reduction, together with a classifier, is presented. Each network connection is transformed into an input data vector. PCA is applied to reduce the high-dimensional data vectors, and anomalies are detected from the distance between a vector and its projection onto the principal subspace. Based on a preliminary analysis using benchmark data from the KDD (Knowledge Discovery and Data Mining) Competition designed by DARPA, PCA demonstrates the ability to reduce high-dimensional data to a lower-dimensional subspace without losing important information. This finding can be used to further enhance detection accuracy for new types of intrusion by taking PCA as a preprocessing step for reducing high-dimensional data.
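
    A minimal sketch of the distance-to-subspace idea, assuming connection records have already been encoded as numeric vectors (the feature encoding of the KDD records, the retained dimensionality, and the threshold are placeholders):

```python
# Sketch only: PCA-based anomaly score as the reconstruction error, i.e. the
# distance between a connection vector and its projection onto the principal
# subspace. The input matrix and threshold are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.rand(1000, 41)                 # placeholder for encoded KDD records
X = StandardScaler().fit_transform(X)

pca = PCA(n_components=10)                   # retained dimensionality is an assumption
Z = pca.fit_transform(X)
X_hat = pca.inverse_transform(Z)             # projection mapped back to input space

score = np.linalg.norm(X - X_hat, axis=1)    # distance to the principal subspace
threshold = np.percentile(score, 99)         # illustrative threshold
anomalies = np.where(score > threshold)[0]
print(f"{len(anomalies)} connections flagged as anomalous")
```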

    Mining Usage Web Log Via Independent Component Analysis And Rough Fuzzy

    In the past few years, web usage mining techniques have grown rapidly together with the explosive growth of the web, in both research and commercial areas. Web Usage Mining is the area of Web Mining that deals with the extraction of interesting knowledge from logging information produced by Web servers. A challenge in web classification is how to deal with the high dimensionality of the feature space. In this paper we present Independent Component Analysis (ICA) for feature selection and use Rough Fuzzy clustering for web user sessions. Our experiments indicate that this approach can improve predictive performance when the original feature set representing the web log is large, and that it can handle the different kinds of uncertainty and imprecision in the data.
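
    A minimal sketch of the ICA dimensionality-reduction step, assuming web sessions have already been encoded as a session-by-feature matrix (the rough fuzzy clustering stage is not shown, and the matrix and component count are assumptions):

```python
# Sketch only: FastICA to reduce a high-dimensional session-feature matrix to a
# small number of maximally independent components before clustering.
# The input matrix and number of components are assumptions.
import numpy as np
from sklearn.decomposition import FastICA

sessions = np.random.rand(500, 200)      # placeholder: 500 sessions x 200 page features

ica = FastICA(n_components=15, random_state=0)
components = ica.fit_transform(sessions) # independent components per session

print(components.shape)                  # (500, 15): reduced representation for clustering
```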

    Independent Component Analysis And Rough Fuzzy Based Approach To Web Usage Mining

    Web Usage Mining is the area of Web Mining that deals with the extraction of interesting knowledge from logging information produced by Web servers. A challenge in web classification is how to deal with the high dimensionality of the feature space. In this paper we present Independent Component Analysis (ICA) for feature selection and use Rough Fuzzy clustering for web user sessions, aiming at the discovery of trends and regularities in web users' access patterns. ICA is a very general-purpose statistical technique in which observed random data are linearly transformed into components that are maximally independent from each other and simultaneously have "interesting" distributions. Our experiments indicate that this approach can improve predictive performance when the original feature set representing the web log is large, and that it can handle the different kinds of uncertainty and imprecision in the data.
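
    The fuzzy side of the rough fuzzy clustering can be sketched with a plain fuzzy c-means loop over the ICA-reduced session vectors; this is a simplified stand-in, and the rough-set lower/upper approximation refinement described in the papers is not reproduced here. Data and parameters are assumptions:

```python
# Sketch only: plain fuzzy c-means in NumPy, used as a simplified stand-in for
# the rough fuzzy clustering stage. Input data and parameters are assumptions.
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, eps=1e-5, seed=0):
    """Return cluster centers and the (n_samples, c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        new_U = 1.0 / (dist ** (2 / (m - 1)))
        new_U /= new_U.sum(axis=1, keepdims=True)
        if np.abs(new_U - U).max() < eps:
            U = new_U
            break
        U = new_U
    return centers, U

X = np.random.rand(500, 15)                      # e.g. the ICA-reduced sessions above
centers, memberships = fuzzy_c_means(X, c=4)
print(memberships[:3])                           # degrees of membership per cluster
```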

    Anomaly detection of intrusion based on integration of rough sets and fuzzy c-means

    As malicious intrusions are a growing problem, we need a solution that detects intrusions accurately. Network administrators are continuously looking for new ways to protect their resources from harm, both internal and external. Intrusion detection systems look for unusual or suspicious activity, such as patterns of network traffic that are likely indicators of unauthorized activity. New intrusion types, of which detection systems are unaware, are the most difficult to detect. The amount of available network audit data is usually large, and human labeling is tedious, time-consuming, and expensive. The objective of this paper is to describe rough sets and fuzzy c-means algorithms and discuss their use to detect intrusions in a computer network. Fuzzy systems have demonstrated their ability to solve different kinds of problems in various application domains. We use rough sets to select a subset of input features for clustering, with the goal of increasing the detection rate and decreasing the false alarm rate in network intrusion detection. Fuzzy c-means allows objects to belong to several clusters simultaneously, with different degrees of membership. Experiments were performed with the DARPA data sets, which contain network traffic recorded during both normal and intrusive behavior.
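
    The rough-set feature selection step can be illustrated with the dependency degree of a candidate attribute subset (the fraction of records the subset classifies unambiguously); a greedy reduct search would repeatedly add the attribute that raises this value most. This is a generic sketch rather than the paper's exact procedure, and the toy audit records are invented:

```python
# Sketch only: rough-set dependency degree gamma(B) = |POS_B(decision)| / |U|,
# the fraction of records whose B-indiscernibility class is pure in the decision.
# The toy audit records below are invented for illustration.
from collections import defaultdict

records = [
    # (protocol, service, flag) -> label
    (("tcp", "http", "SF"), "normal"),
    (("tcp", "http", "SF"), "normal"),
    (("udp", "dns",  "SF"), "normal"),
    (("tcp", "smtp", "REJ"), "attack"),
    (("udp", "dns",  "SF"), "attack"),
]

def dependency_degree(records, attr_idx):
    """Fraction of records whose indiscernibility class (on attr_idx) has one label."""
    classes = defaultdict(list)
    for attrs, label in records:
        key = tuple(attrs[i] for i in attr_idx)
        classes[key].append(label)
    positive = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return positive / len(records)

print(dependency_degree(records, [0]))        # protocol alone -> 0.0
print(dependency_degree(records, [0, 1, 2]))  # all three attributes -> 0.6
```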

    Integrating genetic algorithms and fuzzy c-means for anomaly detection

    The goal of intrusion detection is to discover unauthorized use of computer systems. New intrusion types, of which detection systems are unaware, are the most difficult to detect. The amount of available network audit data is usually large, and human labeling is tedious, time-consuming, and expensive. Traditional anomaly detection algorithms require a set of purely normal data on which to train their model. In this paper we propose an intrusion detection method that combines fuzzy clustering and Genetic Algorithms: a clustering-based intrusion detection algorithm that trains on unlabeled data in order to detect new intrusions. Fuzzy c-means allows objects to belong to several clusters simultaneously, with different degrees of membership, while the Genetic Algorithm (GA) is applied to the selection of optimized feature subsets, reducing the error caused by using hand-selected features. Our method is able to detect many different types of intrusions while maintaining a low false positive rate. We used the data set from the 1999 KDD intrusion detection contest.
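
    A minimal sketch of the GA feature-subset search, with binary chromosomes marking selected features; the fitness function is only a placeholder hook where the fuzzy-clustering detection score would be plugged in, and population sizes and rates are arbitrary, not the paper's configuration:

```python
# Sketch only: a simple genetic algorithm over binary feature masks. The
# evaluate() fitness is a placeholder for the fuzzy-clustering detection score
# described in the abstract; population sizes and rates are arbitrary.
import random

N_FEATURES = 41            # e.g. KDD'99 connection features
POP, GENS, MUT = 30, 40, 0.02

def evaluate(mask):
    """Placeholder fitness: rewards smaller subsets. Replace with a score such as
    detection rate minus false-positive rate from fuzzy c-means clustering."""
    if not any(mask):
        return 0.0
    return 1.0 / sum(mask)

def crossover(a, b):
    cut = random.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]

def mutate(mask):
    return [bit ^ 1 if random.random() < MUT else bit for bit in mask]

population = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(POP)]
for _ in range(GENS):
    scored = sorted(population, key=evaluate, reverse=True)
    parents = scored[: POP // 2]                       # truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = max(population, key=evaluate)
print("selected features:", [i for i, bit in enumerate(best) if bit])
```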